Support Vector Machines based Arabic Language Text Classification System: Feature Selection Comparative Study
Abstract
Feature selection is essential for effective and accurate text classification systems. This paper investigates the effectiveness of six commonly used feature selection methods. Evaluation uses an in-house collected Arabic text classification corpus, and classification is based on a Support Vector Machine (SVM) classifier. The experimental results are presented in terms of precision, recall, and macro-averaged F1 measure.
Key-Words: SVM, Feature Selection, Information Gain, CHI, Odds Ratio, GSS, NGL, Mutual Information, Arabic Text Classification, Arabic Text Categorization.
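As a rough illustration of two of the quantities named in the abstract, the sketch below computes a chi-square (CHI) score for a single term/class pair and a macro-averaged F1 over predicted labels. It is a minimal sketch and not the paper's implementation: the function names `chi_square` and `macro_f1`, the 2x2 contingency-count arguments, and the toy labels are all assumptions made for illustration, and the formula used is the standard chi-square statistic for term selection.

```python
def chi_square(a, b, c, d):
    """Chi-square score for one (term, class) pair from a 2x2 table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    (Hypothetical argument names; counts come from a labeled corpus.)
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table: the term or the class never varies.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Toy example: a term that perfectly separates two classes of 10 documents.
    print(chi_square(10, 0, 0, 10))
    # Toy labels, not taken from the paper's corpus.
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

In a feature selection setting, a score such as `chi_square` would typically be computed for every term and class, and only the highest-scoring terms kept as features before training the classifier.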
Similar resources
Chi Square Feature Extraction Based Svms Arabic Language Text Categorization System
This paper aims to implement a Support Vector Machines (SVMs) based text classification system for Arabic language articles. The classifier uses the chi-square (CHI) method for feature selection in the pre-processing step of the text classification system design procedure. Compared to other classification methods, our system shows high classification effectiveness for an Arabic data set in term ...
Arabic Language Text Classification Using Dependency Syntax-Based Feature Selection
We study the performance of Arabic text classification combining various techniques: (a) tfidf vs. dependency syntax, for feature selection and weighting; (b) class association rules vs. support vector machines, for classification. The Arabic text is used in two forms: rootified and lightly stemmed. The results we obtain show that lightly stemmed text leads to better performance than rootified ...
An Improvement in Support Vector Machines Algorithm with Imperialism Competitive Algorithm for Text Documents Classification
Due to the exponential growth of electronic texts, organizing and managing them requires a tool that delivers the information users search for in the shortest possible time. Thus, classification methods have become very important in recent years. In natural language processing, and especially text processing, one of the most basic tasks is automatic text classification. Moreover, text ...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
A Comparative Study on Arabic Text Classification
This paper focuses on automatic Arabic text classification. Arabic is a highly inflectional and derivational language, which makes text mining a complex task. Many experimental results on classifying Arabic text have been published. Since these results came from different datasets, authors and evaluation metrics, we cannot compare the performance of the experimented classifiers. In this pape...